THE  NORMAL  SCHOOL
QUARTERLY 

Series  14  January,  1916  Number  58 


Standards  Employd  in  the 
Determination  of  Teach- 
ing Efficiency 


By 


EDWIN  A.  TURNER 


PUBLISHT   JANUARY,   APRIL,   JULY,    AND  OCTOBER    OF   EACH 

YEAR  BY  THE  ILLINOIS  STATE  NORMAL  UNIVERSITY, 

NORMAL,  ILLINOIS 

Enterd  August,   1902,  at  Normal,  Illinois,  as  second-class  mail  matter 
under  Act  of  Congress  of  July  16,  1894 

N.  B — Any  teacher  in  Illinois  may  get  the  Normal  School  Quarterly  regularly 
by  sending  exact  name  and  address,  and  by  giving  prompt  notis  of  any  change  of
address. 

Simplified  spellings  ar  used  in  the  offisial  publications  of  the  Illinois  State
Normal  University. 



STANDARDS  EMPLOYD  IN  THE  DETERMINATION
OF  TEACHING  EFFICIENCY

By 
Edwin  A.  Turner 

At  present  our  pedagogical  literature  bristles  with  the  term 
efficiency.  Even  writers  of  ability  use  it  extravagantly.  The 
term  itself  seems  to  satisfy.  It  suggests  the  shop,  the  factory, 
and  the  salesroom,  where  performances  are  judged  in  terms  of 
the  concrete  and  where  definit  standards  are  blockt  out  in  open 
competition.  It  apparently  pacifies  the  longing  for  scientific  ac- 
curacy and  generates  a  feeling  of  confidence  in  him  who  sets  it 
up  for  his  goal. 

Unfortunately  the  teaching  profession  in  the  main  has  adopt- 
ed efficiency  as  its  slogan  without  making  adequate  provision  for 
determining  when  it  is  attaind.  Until  the  spokesmen  for  the 
profession  can  in  a  very  simple  and  in  a  very  practical  way  point 
out  the  meaning  of  efficiency  as  it  relates  to  specific  attainment 
and  can  give  explicit  directions  for  determining  the  degree  of 
efficiency  of  this  or  that  sort  of  teaching,  the  term  efficiency 
must  be  considerd  more  or  less  platitudinous. 

In  the  industries  the  ability  of  the  performer  is  easily  mes- 
ured,  since  the  products  of  his  labor  are  objectiv,  concrete,  and 
redily  subjected  to  comparativ  tests.  The  efficiency  of  the  black- 
smith is  mesured  by  the  length  of  time  the  shoe  clings  to  the 
hoof  and  by  the  degree  of  comfort  it  gives  the  horse.  The  effi- 
ciency of  a  dentist  is  mesured  by  the  length  of  time  the  filling 
remains  in  order  or  by  the  permanency  and  comfort  of  the  bridge 
he  has  made.  The  efficiency  of  a  gardener  is  determind  by  the 
number  and  quality  of  vegetables  produced  per  unit  of  area.  In 
any  case  when  the  result  is  better  than  that  ordinarily  produced 
the  performer  is  thought  of  as  having  superior  ability  and  conse- 
quently he  is  considerd  efficient. 


Subjectivly  considerd,  efficiency  is  the  ability  to  produce  su- 
perlativ  results  consistently.  The  median  or  average  of  a  number 
of  such  abilities  is  a  desirable  standard  to  use  in  an  endevor  to 
determin  the  merit  of  individual  performances.  In  the  industrial 
and  scientific  fields  such  standards  are  well  known.  In  the 
teaching  profession  we  have  just  begun  to  use  them  advantageous- 
ly. We  cannot  hope  to  attain  efficiency  until  we  are  able  to 
determin  when  it  is  attaind. 

With  the  single  exception  of  the  minimum  knowledge  re- 
quirement, which  is  generally  provided  by  law,  there  is  no  other 
legally  accepted  standard  for  judging  the  ability  of  teachers.  The 
wide  and  varied  use  of  standards  employd  in  determining  the 
ability  of  teachers  is  notorious. 

The  far-reaching  significance  of  the  conditions  resulting 
from  the  application  of  dissimilar  standards  is  beyond  the  com- 
prehension of  those  who  evaluate  the  teaching  process  in  terms 
of  local  and  personal  standards.  There  is  not  a  little  evidence  to 
substantiate  the  opinion  that  subnormality,  retardation,  disin- 
terestedness, disobedience,  and  withdrawals  from  school  are  the 
direct  result  of  the  inadequate  standards  held  by  administrators 
and  teachers.  Until  some  of  the  standards  now  employd  in  mes- 
uring  the  results  of  the  teaching  process  are  discarded  and  others 
are  materially  modified,  the  proportion  of  abnormalities  occur- 
ring in  the  schools  will  not  be  materially  changed. 

STANDARDS  OF  MESUREMENT 

There  are  two  distinct  classes  of  standards  now  employd  in 
determining  the  merit  of  teaching.  These  may  well  be  cald  the 
a  priori  standards  and  the  objectiv  standards.  The  former  are 
deductions  based  upon  definitions  formd,  principles  assumed,  or 
inferences  drawn  from  known  causes.  The  latter  are  based  upon 
the  mesured  abilities  of  pupils. 

I.  A  PRIORI  STANDARDS

This  class  of  standards  is  in  the  main  the  outgrowth  of  an 
attempt  on  the  part  of  those  who  have  been  responsible  for  the 
direction  of  educational  agencies  to  account  for  the  character  of 
the  servises  renderd  by  teachers,  on  the  basis  of  some  real  or 
imaginary  principle  either  directly  or  indirectly  related  to  the  art 
of  teaching.  The  quality  and  relativ  value  of  each  standard  in
this  class  depends  upon  the  educational  ideals  and  insight  of
those  who  have  establisht  it. 

The  standards  employd  in  the  early  stages  of  educational 
development  and  those  still  employd  by  persons  unfamiliar  with 
the  essentials  of  the  teaching  process  are  crude  and  often  ludic- 
rous. On  the  other  hand  the  standards  which  have  been  estab- 
lisht by  educational  experts,  in  the  light  of  recent  research,  are 
excedingly  valuable  in  that  they  stimulate  an  analysis  of  the  pro- 
cess and  give  valuable  direction  to  teaching. 

The  Attitude  of  Pupils  and  the  Community  Towards  the  Teacher 

This  standard  is  too  frequently  used  by  school  officials  in 
determining  the  efficiency  of  their  teachers.  If  the  children  and 
the  community  are  fond  of  a  teacher  it  is  assumed  that  he  is 
giving  splendid  servis  in  the  classroom.  If  he  is  not  generally 
popular  it  is  taken  for  granted  that  he  is  giving  poor  servis. 
Doutless  this  standard  was  developt  in  and  about  the  private 
school,  and  especially  the  subscription  school  where  the  teacher 
"boarded  around".  Under  such  conditions  adaptability  was  the 
prime  requisit  of  survival.  In  spite  of  the  wonderful  growth  in 
the  science  of  teaching  there  still  exists  in  some  communities 
the  notion  that  popularity  is  an  index  of  efficiency. 

It  is  reasonably  certain  that  a  teacher  of  character  and  of 
fine  teaching  ability  will  win  the  respect  and  usually  the  admira- 
tion of  his  pupils  and  patrons.  It  is  quite  as  reasonably  certain 
that  a  relativly  inferior  teacher  may  and  not  infrequently  does 
win  the  esteem  and  harty  support  of  the  entire  community  in 
which  he  teaches.  This  esteem  may  result  from  local  political 
activity,  church  connections,  participation  in  club  activities,  or  it 
may  be  in  response  to  a  wholesome  attitude  of  the  teacher  towards 
the  life  of  the  community,  all  of  which  may  be  excellent  sup- 
plementary qualities  for  a  teacher  to  possess.  Certainly  they 
should  not  be  the  main  consideration  in  the  selection  of  a  teacher. 

Being  a  "good  fellow"  is  an  enviable  human  trait,  but  it  has 
no  legitimate  place  among  the  basal  standards  which  are  employd 
in  determining  the  worth  of  teachers.  The  social  and  personal 
qualities  of  the  officers  of  a  bank  do  not  become  an  incentiv  to 
me  as  a  depositor  until  the  standing  of  the  bank  and  the  integrity 
of  the  officials  have  been  ascertaind.  The  harty  greeting  and 
the  talkativ  propensities  of  a  barber  do  not  become  an  inducement
to  me  to  patronize  his  shop  until  I  have  determind  the  fine
quality  of  his  razor  and  the  sanitary  practises  of  his  establish- 
ment. No  thoughtful  parent  will  let  church  connections,  social 
prestige,  political  affiliations,  or  friendship  of  long  standing  be 
the  predominating  factor  in  the  choice  of  a  physician  for  his 
dangerously  sick  child.  Certainly  there  are  stronger  reasons  why 
these  supplemental  and  most  desirable  qualities  should  not  be  con- 
siderd  basic  in  the  selection  of  a  teacher. 

Character  of  Grades  and  Number  of  Promotions 

Another  common  and  widely  used  standard  of  judging 
teaching  efficiency,  closely  related  to  the  above,  is  that  of  grades 
and  promotions.  It  is  passing  strange  that  this  standard  of  mes- 
urement  should  be  relied  upon  so  extensivly.  A  parent  usually 
thinks  his  children  well  taught  if  they  receiv  high  grades.  He 
is  quite  as  strongly  convinced  of  the  teacher's  inferiority  if  his 
children  fail  of  promotion.  In  view  of  recent  investigations  in 
respect  to  the  reliability  of  grades,  as  an  index  of  actual  achieve- 
ment, this  standard  is  a  travesty  upon  the  science  of  education. 
A  grade  as  ordinarily  determind  is,  to  say  the  least,  the  expres- 
sion of  a  conglomerate  impression  which  may  be  colored  by  a 
single  performance  of  the  pupil,  by  his  general  attitude  toward 
the  school,  by  the  emotional  controls  of  the  teacher,  or  by  the 
personal  relations  which  exist  between  teacher  and  pupil  or 
between  the  teacher  and  the  family  of  the  pupil. 

Grades  vary  in  proportion  to  the  variation  of  personal  stand- 
ards. It  is  reasonable  to  suppose  that  an  easy-going  teacher  is 
more  likely  to  give  high  grades  than  is  the  teacher  who  is  ex- 
cessivly  conscientious  and  diligent  in  an  endevor  to  improve  the 
standing  of  his  pupils.  It  not  infrequently  happens  that  the 
grades  of  two  chums,  or  of  two  children  whose  families  are  inti- 
mate, are  adjusted  from  month  to  month  so  that  first  one  pupil 
and  then  the  other  has  the  higher  grade.  It  is  notorious  that 
good  children  receiv  higher  grades  in  proportion  to  their  ability 
than  do  mischievous  children.  Other  influences  well  known  to 
the  profession  are  factors  in  determining  grades.  The  multi- 
plicity of  factors  involvd  in  grade  making  is  a  strong  indictment 
of  the  practis  of  judging  teachers  exclusivly  or  even  partially 
on  the  basis  of  the  promotion  list. 


Classroom  Technique 

The  value  of  this  standard  rests  on  the  assumption  that  there 
is  a  close  correlation  between  the  character  of  the  stimuli  employd 
by  the  teacher  and  the  character  of  the  child's  controls  which 
result  from  the  use  of  such  stimuli. 

On  the  basis  of  this  assumption  one  procedes  to  determin  a 
teacher's  efficiency  by  an  examination  of  her  classroom  technique. 
The  following  items  are  usually  considerd  in  such  procedure: 
(1)  forms  of  presenting  subject-matter,  such  as  the  lecture 
method,  the  textbook  method,  the  developing  method,  including 
a  combination  of  one  or  more  of  these  methods ;  (2)  the  character 
of  the  question  employd — the  direct  question,  indirect  question, 
elliptical  question,  leading  question,  etc.;  (3)  the  sort  of  other 
devices  used — illustrations,  drawings,  field  trips,  concrete  materi- 
als for  science  work,  pictures,  maps,  etc. ;  (4)  the  language  of 
the  teacher,  his  intonation,  the  board  work,  the  general  appear- 
ance of  the  classroom,  and  especially  the  spiritual  atmosfere  of 
the  room. 

This  standard  is  decidedly  more  reliable  than  either  of  those 
previously  considerd.  It  finds  justification  in  the  common  agree- 
ment that  the  majority  of  teachers  who  get  splendid  results  employ 
a  good  technique.  In  fact,  teachers  of  this  type  find  technique 
indispensable.  It  is  in  harmony  also  with  certain  generally 
accepted  psychological  principles.  However,  technique  is  not  of
itself  a  sufficient  guarantee  of  adequate  results,  because  of  the
large  number  of  variables  introduced  in  its  application.  The 
value  of  a  device  depends  in  large  mesure  upon  the  experiences, 
judgment,  temperament,  zest,  clearness  of  vision,  physical  energy, 
and  high  ideals  of  the  teacher.  Without  these  attributes  in  their 
proper  proportion,  technique  in  operation  resolvs  itself  into  the 
lifeless  movement  of  school  machinery;  with  them  it  insures 
accuracy,  effectivness,  consistency,  and  the  proper  distribution 
of  time  and  energy. 

The  Reactiv  Attitude  of  the  Child 

In  discussing  the  relativ  merit  of  this  standard  with  that  of 
the  preceding  one,  F.  M.  McMurry  says :  "Teachers,  supervizors 
of  teachers,  and  authors  of  books  on  teaching,  have  been  so  in- 
tently observant  of  the  procedure  of  the  teacher  that  they  have 
overlookt  that  of  the  pupil.  Yet  the  center  of  gravity  of  the
school  lies  in  the  pupil,  and  what  he  himself  finally  does  determins
the  value  of  the  teacher's  efforts.  He,  therefore,  should  be  the 
primary  object  of  consideration  rather  than  the  teacher,  and  the 
quality  of  the  instruction  should  be  judged  mainly  in  terms  of 
his  activity." 

In  conformity  to  this  notion  McMurry  formulated  the  following
criteria  for  the  mesuring  of  teaching  efficiency:  (1)  Motiv
on  the  part  of  the  child;  (2)  Consideration  of  values  by  the
pupils;  (3)  Attention  to  organization  by  the  pupils;  (4)  Initiativ
by  the  pupils.

The  superiority  of  this  standard  over  those  previously  men- 
tiond  is  at  once  evident.  It  strikes  right  at  the  hart  of  the  lern- 
ing  process,  or  as  Tompkins  would  put  it,  at  the  spiritual  unity 
within  the  child.  The  author  of  the  above  criteria  not  only  be- 
lievs  in  the  theory  that  "the  center  of  gravity  of  the  school  lies 
in  the  pupil",  but  he  applies  this  theory  daily  in  his  classroom. 
Those  who  hav  attended  his  classes  know  that  he  practises  all 
that  he  preaches. 

If  the  pupil's  reactiv  attitude  is  the  key  to  educational  direction
and  the  goal  of  educational  effort,  as  we  believ  it  to  be,  it
is  fair  to  assume  that  it  should  be  of  paramount  consideration  in 
any  attempt  to  determin  the  quality  of  teaching. 

As  principles  of  direction  the  above  criteria  are  all  that  is 
desired.  They  force  analysis  of  the  teaching  process,  and  suggest 
the  proper  distribution  and  emfasis  of  the  teaching  agencies. 
They  are  basic  to  our  whole  scheme  of  pedagogy.  To  abandon 
the  principles  underlying  these  criteria  would  be  to  ignore  teach- 
ing as  a  profession. 

Tho  indispensable  as  an  agency  for  the  improvement  of 
teaching  these  criteria  are  decidedly  inadequate  as  a  means  of 
determining  the  relativ  merit  of  teaching.  Their  inadequacy  is 
due  to  the  fact  that  the  character  of  their  application  depends 
entirely  upon  the  judgments  of  those  who  attempt  to  determin 
the  merit  of  teaching.  The  necessity  of  interpretation  introduces 
a  decided  variable. 

The  decisions  of  several  judges  as  to  the  merits  of  a  certain 
recitation  will  vary  in  proportion  to  the  variation  in  their  exper- 
iences and  insight.  What  may  seem  to  be  "motiv  on  the  part  of 
the  child"  to  one  observer  may  appear  as  excessiv  emotion  to 
another.  Indications  of  a  "consideration  of  values"  to  one  judge
may  appear  as  a  wanton  neglect  of  essentials  to  another.  "Attention
to  organization"  to  another  observer  may  impress  his  associates
as  being  a  mere  juggling  of  facts.  Indeed,  what  may
seem  to  one  critic  as  "initiativ  of  the  pupils"  may  appear  to  another 
as  rampant  individualism.  Just  as  the  jury  is  an  uncontrollable 
variable  in  the  machinery  of  justis,  so  the  supervizor  as  a  per- 
sonal judge  of  teaching  efficiency  is  a  variable  which  is  exced- 
ingly  difficult  to  reckon  with  in  the  application  of  the  McMurry 
criteria. 

Subjectiv  Guides  and  Scales 

Numerous  guides  and  scales  hav  been  developt  of  recent 
years  for  estimating  the  work  of  teachers.  These  ar  valuable 
to  the  supervizor  in  that  they  force  analysis  of  the  teaching  act 
and  thereby  make  it  possible  for  him  to  point  out  definitly  the 
strong  and  weak  points  in  the  recitation,  and  afford  an  oppor- 
tunity for  him  to  give  the  teacher  some  practical  suggestions  as 
to  the  improvement  of  his  methods. 

The  following  "Ten-point  scale"  is  somewhat  typical  of 
helps  of  this  sort: 

TEN-POINT  SCALE  FOR  ESTIMATING  CLASSROOM  WORK
IN  HIGH  SCHOOLS¹

   I.  "Setting"  of  class  topics  in  the  course.
  II.  Mastery  of  intellectual  content  and  effectiv  logical
       organization  of  materials.
 III.  The  mechanics  of  classroom  management.  Economy  of
       time  and  grasp  of  pedagogical  technique.
  IV.  Effectiv  emfasis  upon  the  mental  processes  and  values
       peculiar  and  essential  to  the  subject.
   V.  Independence  of  teacher  and  class  as  a  growth  toward
       their  material.
  VI.  Suitability  to  the  pupil  of  the  type  of  recitation  employd.
 VII.  The  "common  sense"  factor.
VIII.  Evidence  of  culture  versus  mere  erudition.
  IX.  Class  participation  and  class  sense  of  responsibility.
   X.  Class  respect  for  lerning.

¹ A  tentativ  scale  now  being  prepared  by  Professor  Charles  Hughes  Johnston
of  the  University  of  Illinois.

Scales  of  this  sort  do  not,  however,  materially  assist  the
supervizor  in  judging  the  relativ  results  of  teaching.  In  the
application  of  this  scale  as  in  the  application  of  the  McMurry
standards  a  markt  variable  is  introduced  in  the  judge  who  applies 
it.  Furthermore,  the  points  are  not  of  equal  significance.  Some 
of  these  points  are  several  times  more  significant  than  others. 
Two  teachers  of  widely  different  abilities  when  mesured  by  this 
scale  may  receiv  the  same  numerical  mark.  One  may  be  stronger 
in  the  essentials ;  the  other  stronger  in  the  non-essentials. 
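
The point about unequal significance can be shown with a little arithmetic. The sketch below is only an illustration under stated assumptions: the ratings, the 0-to-10 rating range, and the weights are all invented for the example. The scale itself assigns no weights, and treating points II and IV as the "essentials" is purely hypothetical.

```python
# Hypothetical illustration: ratings, rating range, and weights are invented.
POINTS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"]

# Assumption: points II and IV are treated as "essentials" and count three
# times as much as the others. The source scale does not state any weights.
ASSUMED_WEIGHTS = {p: (3 if p in ("II", "IV") else 1) for p in POINTS}


def unweighted_mark(ratings):
    """Sum the ratings, treating all ten points as equally significant."""
    return sum(ratings[p] for p in POINTS)


def weighted_mark(ratings, weights):
    """Sum the ratings after multiplying each by its assumed weight."""
    return sum(weights[p] * ratings[p] for p in POINTS)


# Teacher A: strong in the assumed essentials, middling elsewhere.
teacher_a = {p: (9 if p in ("II", "IV") else 5) for p in POINTS}
# Teacher B: weaker in the assumed essentials, slightly stronger elsewhere.
teacher_b = {p: (5 if p in ("II", "IV") else 6) for p in POINTS}

print(unweighted_mark(teacher_a), unweighted_mark(teacher_b))
print(weighted_mark(teacher_a, ASSUMED_WEIGHTS),
      weighted_mark(teacher_b, ASSUMED_WEIGHTS))
```

With these invented figures the two teachers receive the same unweighted mark, yet the weighted marks separate them, which is the difficulty described above.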

II.  OBJECTIV  STANDARDS 

Objectiv  standards  may  be  divided  roughly  into  two  classes: 
(1)  standardised  tests;  (2)  standardised  scales.  The  former  is 
a  graded  series  of  problems  accompanied  by  the  number  of  cor- 
rect answers  obtaind  by  a  median  pupil  of  a  widely  selected  group. 
The  Courtis  Standard  Tests,  The  Kansas  Silent  Reading  Test, 
and  The  Thorndike  Reading  Scale  are  standards  of  this  type. 
The  latter  is  an  arrangement  of  the  carefully  prepared  work  of 
pupils  into  an  evenly  graded  system  which  has  been  determind 
and  evaluated  by  a  number  of  competent  judges.  Thorndike's 
and  Ayres'  Handwriting  Scales,  The  Harvard-Newton  Com- 
position Scales,  and  Thorndike's  Drawing  Scale  are  standards  of 
this  type. 

A  historical  survey  of  the  objectiv  standards,  accompanied  by 
a  discussion  of  their  relativ  merit,  is  perhaps  the  easiest  and  dout- 
less  the  most  pedagogical  way  of  showing  the  relativ  educational 
value  of  these  standards  as  agencies  in  determining  the  quality 
of  teaching  and  in  paving  the  way  for  placing  teaching  upon  a 
scientific  basis,  a  distinction  which  it  does  not  as  yet  merit.

Origin  of  Objectiv  Standards  in  America

So  far  as  I  can  ascertain,  Dr.  J.  M.  Rice  is  the  father  of  the
objectiv  standard  in  America.  Zelous  for  better  opportunities 
for  the  child,  enthused  by  his  recent  psychological  studies  at  Jena 
and  Leipsic,  free  from  prejudices  which  sometimes  result  from 
inferior  teaching,  he  set  for  his  task  the  exposition  of  certain 
evils  which  he  conceivd  to  exist  in  the  public  schools.  Conse- 
quently from  1891  to  1896  he  became  a  critical  student  of  educa- 
tion. He  visited  and  examind  the  schools  of  one  hundred  Amer- 
ican cities.  He  pointed  out  in  the  colums  of  the  "Forum"  what 
seemd  to  him  remedial  mesures  for  these  schools.  After  four 
years  of  constant  investigation  he  came  to  the  very  decided  conviction
that  concerted  effort  towards  obtaining  satisfactory  results
in  public  education  is  impossible  until  we  know  what
satisfactory  results  are.  "If  we  do  not  know",  he  wrote  in  the 
"Forum",  December,  '96,  "what  we  mean  by  satisfactory  results, 
how  shall  we  be  able  with  any  degree  of  intelligence  to  judge 
when  our  task  has  been  satisfactorily  performd?  Until  we  come 
to  a  definit  understanding  in  regard  to  this  matter,  our  entire 
educational  work  will  lack  direction  and  we  shall  continue  as 
heretofore,  to  grope  our  way  along  passages  completely  envelopt 
in  darkness  in  an  endevor  to  land  we  know  not  where. 

"If  we  might  have  a  standard  which  would  enable  us  to  tell 
when  our  task  has  been  completed,  our  attention  might  be 
earnestly  directed  towards  the  discovery  of  short  cuts  in  educa- 
tional processes.  For  want  of  such  a  standard  each  individual 
teacher  has  thus  far  been  a  law  unto  himself ;  permitted  to  ex- 
periment on  his  pupils  in  accordance  with  his  own  individual 
educational  notions,  whether  inherited  from  his  grandfather  or 
the  result  of  his  study  and  reflection,  entirely  regardless  of  what 
was  being  done  by  others.  So  long  as  this  condition  is  possible, 
pedagogy  cannot  lay  claims  to  recognition  as  a  science.  Until  an 
accurate  standard  of  mesurement  (my  italics)  is  recognized  by 
which  such  truths  may  be  discoverd,  ward  politicians  will  con- 
tinue to  wield  the  baton  and  educational  anarchy  will  continue  to 
prevail." 

The  First  Objectiv  Standard 

Dr.  Rice  was  not  a  faddist.  Indeed,  he  was  excedingly 
practical.  In  his  characteristic  way  he  set  out  in  1896  to  establish 
a  standard  of  mesurement  for  spelling.  He  undertook  personally 
the  herculean  task  of  examining  13,000  children  in  spelling.  This 
investigation  extended  over  a  period  of  sixteen  months  and  in- 
cluded sixteen  American  cities. 

The  children  were  tested  on  a  list  of  words,  on  words  given 
to  them  in  sentences,  and  on  the  words  used  in  their  compositions. 
The  tabulated  results  in  the  "Forum"  for  April,  1897,  are,  so  far  as
I  have  discoverd,  the  first  objectiv  standard  in  spelling  or  in  any 
other  subject.  The  list  of  words  standardized  by  him  consists 
of  too  few  words  to  be  of  servis  in  judging  the  spelling  abilities 
of  children.  The  list  of  words  presented  in  sentences  is  subject 
to  the  same  criticism.  This  objection  does  not  hold  for  his  composition
test.  Had  he  estimated  the  percent  of  words  correctly
speld  in  the  compositions  on  the  basis  of  the  number  of  different
words  used,  insted  of  upon  the  basis  of  the  entire  number  of  words
used,  he  would  have  establisht  the  first  practical  objectiv  standard. 
As  it  is  his  percents  of  words  correctly  speld  are  entirely  too 
high. 
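
The difference between the two bases of counting can be sketched in a few lines. This is a minimal illustration, not a reconstruction of Rice's procedure: the sample composition is invented, each word is represented as a hypothetical (intended, written) pair, and a different word is assumed to count as correct only if it is never misspelled.

```python
def percent_correct_all_words(tokens):
    """Percent correct on the basis of the entire number of words used."""
    correct = sum(1 for intended, written in tokens if intended == written)
    return 100 * correct / len(tokens)


def percent_correct_different_words(tokens):
    """Percent correct on the basis of the number of different words used.

    Assumption: a word counts as correct only if no occurrence is misspelled.
    """
    different = {intended for intended, _ in tokens}
    misspelled = {intended for intended, written in tokens
                  if intended != written}
    return 100 * (len(different) - len(misspelled)) / len(different)


# Invented composition: common words recur correctly, hard words appear once.
composition = (
    [("the", "the")] * 6
    + [("school", "school")] * 2
    + [("receive", "recieve"), ("separate", "seperate")]
)

print(percent_correct_all_words(composition))
print(percent_correct_different_words(composition))
```

Because easy words recur often, the count on the entire number of words comes out well above the count on different words, which is the sense in which the reported percents run high.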

Rice's  Arithmetic  Test 

In  the  October  number  of  the  "Forum",  1902,  Dr.  Rice  re- 
ported the  results  of  an  arithmetic  test  which  he  had  conducted 
in  seven  different  cities,  including  eighteen  bildings  and  8,000 
children.  As  Stone  pointed  out,  later,  Dr.  Rice's  results  were  not 
satisfactory  as  a  standard,  due  to  certain  limitations  in  the  prob- 
lems used  and  the  character  of  the  methods  employd  in  gathering 
and  scoring  these. 

Rice's  Language  Test 

One  year  later  Dr.  Rice  gave  a  detaild  report  of  the  test  he 
made  in  language.  This  test  extended  to  nine  cities,  and  included 
twenty-two  schools,  containing  8,300  children.  The  compositions 
were  arranged  in  five  groups  on  the  basis  of  relativ  merit.  The 
papers  of  each  group  were  graded  100%,  75%,  50%,  25%, 
0%  respectivly.  The  results  showd  conclusivly  that  there  was 
a  wide  variation  in  the  English  abilities  tested  by  him,  but  owing 
to  the  strong  probability  of  error  in  his  results  they  hav  not 
been  employd  as  a  standard  for  determining  English  ability. 

Tho  Dr.  Rice's  results  are  of  little  value  as  standards,  his 
experiments  have  stimulated  two  lines  of  research  in  education 
which  are  fraught  with  wonderful  possibilities.  I  refer  on  the 
one  hand  to  the  investigations  which  have  had  for  their  goal  the 
establishment  of  objectiv  standards  of  mesurement,  and  on  the 
other  to  the  investigations  to  determin  minimum  essentials.  Both 
of  these  problems  were  raisd  by  Dr.  Rice  and  he  has  lived  to  see 
some  partial  solutions  of  both. 

The  Cornman  Spelling  Standard 

Dr.  O.  P.  Cornman,  of  Philadelphia,  stimulated  by  the  work 
of  Rice,  carried  on  a  series  of  tests  in  spelling  by  the  composition 
method,  extending  from  June,  '96,  to  June,  '98.  In  1903  he 
publisht  the  results  of  this  investigation  in  a  volume  entitled 
Spelling  in  the  Elementary  School.  Cornman's  data  were  care- 
fully gatherd  and  the  results  methodically  tabulated.  He  sub- 
stituted the  median  for  the  average  employd  by  Rice.  In  his 
composition  test  Rice  counted  all  the  words  which  were  speld
correctly,  including  all  recurring  words  when  properly  speld.
When  a  misspeld  word  recurd  he  counted  it  but  once.  On  this 
basis  of  counting  he  determind  the  spelling  abilities  of  the  chil- 
dren in  terms  of  the  percentage  of  words  speld  correctly.  This 
accounts  for  the  high  percentages  which  he  reported.  Cornman 
counted  all  words  in  the  composition  and  determind  the  ratio  of 
the  speld  words  and  misspeld  words  in  terms  of  percent.  He 
not  only  counted  the  recurring  words  which  were  speld  correctly 
but  the  recurring  misspeld  words  as  well.  This  accounts  for  his 
percentages  being  lower  than  those  reported  by  Rice. 
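
The two methods of counting can be compared on a single composition. The sketch below rests on assumptions: the sample data are invented, words are represented as hypothetical (intended, written) pairs, and Rice's base is read from the description above as the correctly speld occurrences plus each different misspelling counted once.

```python
def count_correct(tokens):
    """Number of word occurrences that are speld correctly."""
    return sum(1 for intended, written in tokens if intended == written)


def rice_percent(tokens):
    """Rice's count as read here: a recurring misspelling counts only once."""
    correct = count_correct(tokens)
    different_errors = len({written for intended, written in tokens
                            if intended != written})
    return 100 * correct / (correct + different_errors)


def cornman_percent(tokens):
    """Cornman's count: every occurrence counts, right or wrong."""
    return 100 * count_correct(tokens) / len(tokens)


# Invented composition: 16 correct occurrences, one misspelling repeated
# three times, and one misspelling that occurs once.
composition = (
    [("and", "and")] * 16
    + [("receive", "recieve")] * 3
    + [("separate", "seperate")]
)

print(round(rice_percent(composition), 1))
print(cornman_percent(composition))
```

On the same paper the first method yields the higher figure, since repeated errors drop out of its base while repeated correct words remain in it.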

The  work  of  Rice  and  Cornman  stimulated  many  young  men 
in  the  large  educational  centers.  Edward  L.  Thorndike,  who  has 
since  become  the  wizard  of  the  objectiv  standard,  wrote  in  the 
"Forum"  in  1905  as  follows :  "The  study  of  education  is  begin- 
ning to  be  quantitativ,  we  are  becoming  properly  disgusted  with 
the  one-sided  booking  which  only  takes  account  of  dollars  spent 
and  neglects  the  debit  side,  the  income  in  knowledge,  habits, 
power,  zeal  and  ideals.  This  ambition  toward  an  exact  objectiv 
mesurement  of  the  results  of  educational  endevor  is  a  symptom 
of  helthy  scientific  fervor  and  also  of  common  sense  wisdom. 
No  one  possest  of  science  or  sense  will  deny  the  value  of  suc- 
cessful quantitativ  study  of  school  work." 

Arithmetic  Abilities  of  Children  in  the  Sixth  Grade  (Stone) 

In  1908  C.  W.  Stone  publisht  in  the  Columbia  University 
Contributions  to  Education  a  report  on  the  arithmetical  abilities 
of  children  in  the  six-A  grade.  Mr.  Stone  personally  conducted 
the  examinations  in  twenty-six  school  systems,  including  seventy- 
nine  schools  and  6,000  children.  He  gave  one  test  in  the  funda- 
mentals and  one  in  the  reasoning  processes. 

Stone's  method  of  gathering  data  and  of  tabulating  results 
was  superior.  He  set  a  standard  in  this  particular  which  has 
been  emulated  by  later  investigators.  The  exercises  in  the 
tests  proved,  as  Courtis  pointed  out,  too  complex  for  practical 
mesurements.  The  results  were  a  mesure  of  a  combination  of 
abilities  in  the  fundamentals  and  in  the  reasoning  processes,  and 
consequently  were  difficult  of  interpretation  and  application. 
Because  of  the  difficulty  of  applying  his  results  they  have  not 
been  used  extensivly  in  determining  arithmetical  abilities. 


The  Thorndike  Handwriting  Scale 

The  first  satisfactory  result  from  a  practical  point  of  view  of 
all  the  agitation  for  quantitativ  standards  of  mesurement  occurd 
in  1910.  The  Thorndike  Scale  for  Judging  Handwriting  appeard 
in  the  Teachers  College  Record  of  that  year.  Referring  to  this 
scale,  Ayres  says,  "The  credit  of  developing  the  first  mesuring 
scale  for  handwriting  belongs  to  Professor  Edward  L.  Thorndike 
of  Teachers  College,  Columbia  University.  The  publication,  in 
March,  1910,  of  his  Handwriting  Scale  constituted  a  most  im- 
portant contribution  not  only  to  experimental  pedagogy,  but  to 
the  entire  movement  for  the  scientific  study  of  education." 

In  reference  to  the  need  of  such  a  scale  Thorndike  said,  "At 
present  we  can  do  no  better  than  estimate  a  handwriting  as  very 
bad,  good,  very  good,  or  extremely  good,  knowing  only  vaguely 
what  we  mean  thereby,  running  a  risk  of  shifting  our  standards 
with  time,  and  only  by  chance  meaning  the  same  by  a  word  as 
some  other  student  of  the  facts  means  by  it.  We  are  in  the 
condition  in  which  the  students  of  temperature  were  before  the 
discovery  of  the  thermometer,  or  any  other  scale  for  mesuring 
temperature  beyond  the  very  hot,  hot,  warm,  lukewarm,  and  the 
like,  of  subjectiv  opinion." 

Altho,  as  Ayres  pointed  out,  this  Handwriting  Scale  con- 
stituted a  most  important  contribution  not  only  to  experimental 
pedagogy  but  to  the  entire  movement  for  the  scientific  study  of 
education,  Professor  Thorndike  in  his  presentation  was  sensitiv 
of  its  imperfections.  He  says :  "The  scale  is  presented  now  in 
spite  of  its  imperfections,  for  these  reasons.  It  is  the  result  of 
some  twenty  ratings,  and  ensures  mesurements  far  more  accurate 
than  anyone  could  make  without  it.  For  the  present  needs  of 
school  practis  and  educational  research,  a  very  precise  instrument 
for  mesuring  handwriting  is  not  required.  The  best  way  to  get 
a  more  perfect  scale  is  by  the  use  of  this  one  as  a  starting  point." 

The  Thorndike  scale  represents  types  of  the  handwriting  of 
children  of  grades  five  to  eight  inclusivly.  The  writing  from 
these  grades  was  groupt  into  eleven  groups  on  the  basis  of 
quality.  The  quality  of  the  groups  is  represented  by  figures  7,  8, 
9,  10,  11,  12,  13,  14,  15,  16,  and  17  respectivly.  Quality  7  repre- 
sents the  poorest  samples  taken  from  grade  five  and  quality  17 
represents the best samples taken from grade eight. The steps
of difference between the qualities were equal in the sense of
being cald equal by from twenty-three to fifty-five competent
judges. This means "that 14 is as much better than 13 as 13 is
than 12; that 13 is as much better than 12, as 12 is better than 11,
and so on; that quality 14 is two times as far above zero merit in
handwriting as quality 7."

The  scale  includes  quality  18,  which  was  taken  from  a  copy 
book,  and  qualities  4,  5,  and  6.  Samples  5  and  6  were  taken  from 
the  fourth  grade  and  sample  4  was  manufactured  for  the  purpose 
of  extending  the  scale  below  the  merit  of  fourth-grade  children. 

The  Thorndike  Handwriting  Scale  is  easily  applied  in  testing 
the  quality  of  handwriting.  After  a  little  experience  a  teacher 
can  scale  the  writing  of  her  entire  room  in  a  very  short  time.  By 
means  of  such  a  scale  we  have  often  mesured  the  writing  of  an 
entire room in less time than a forty-minute period. The several
samples  supplied  for  each  of  the  qualities  16,  15,  14,  13,  12,  11, 
9,  and  8  make  it  especially  easy  to  apply  this  scale. 

Teachers  who  habitually  think  of  quality  in  terms  of  grades 
can,  for  all  practical  purposes,  easily  transfer  the  qualities  of  the 
scale  into  grades  by  multiplying  the  numbers  of  the  scale  by  5.8. 
Those  who  have  mesured  the  merit  of  handwriting  with  this  or 
the Ayres scale will not be content to judge the merit of writing
in  terms  of  personal  experience. 
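
The conversion described above is plain arithmetic. A minimal sketch in Python follows; the function name `quality_to_grade` and the rounding to one decimal place are assumptions made for illustration, and only the factor 5.8 comes from the text.

```python
def quality_to_grade(quality, factor=5.8):
    """Convert a Thorndike scale quality to a percentage-style grade.

    The factor 5.8 is the multiplier given in the text; rounding to one
    decimal place is an illustrative choice, not part of the source.
    """
    return round(quality * factor, 1)


# Qualities 7 and 17 are the ends of the grade five-to-eight range.
for q in (7, 13, 17):
    print(q, quality_to_grade(q))
```

With this factor the best eighth-grade quality, 17, falls just below 100, which appears to be why the multiplier suits teachers who think in percentage grades.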

The  Ayres  Handwriting  Scale 

In November, 1911, Leonard P. Ayres of the Russell Sage
Foundation  began  a  preliminary  experiment  to  determin  the 
relativ  legibility  of  different  samples  of  handwriting.  He  early 
concluded  that  the  scheme  was  feasible  and  proceded  to  perfect 
a  writing  scale  on  that  basis.  His  first  printed  scale  appeard  in 
February, 1912. In discussing the merits of this scale he says:
"The  method  by  which  the  present  scale  has  been  produced,  and 
the  criterion  on  which  it  rests  as  a  basis  differ  radically  from 
those  adopted  by  Professor  Thorndike.  The  difference  in  the 
basis  is  that  in  the  present  case  legibility  has  been  adopted  as  a 
criterion  for  rating  the  different  samples  in  place  of  'general 
merit'  used  as  the  basis  of  Thorndike's  scale.  The  change  sub- 
stitutes function  for  appearance  as  a  criterion  for  judging  hand- 
writing." 



Ayres  gatherd  1,578  samples  of  writing  from  forty  school 
systems.  The  samples  were  red  by  ten  readers,  each  of  whom  by 
means  of  a  stop  watch  recorded  the  exact  number  of  seconds 
required  to  read  each  sample.  The  samples  were  then  placed 
in  eight  groups  on  the  basis  of  the  time  required  to  read  them. 
The  following  table  shows  the  rating  of  a  type  sample  of  each 
group. 

Table I

Point on scale    Rating in words red per minute of sample found at each point
     90%                 209.2
     80%                 202.7
     70%                 195.1
     60%                 186.2
     50%                 175.7
     40%                 163.4
     30%                 149.1
     20%                 132.2
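
As an illustration of how the type-sample rates in Table I might be used, the following Python sketch assigns a new sample to the scale point whose type sample has the closest reading rate. The nearest-match rule and the names `AYRES_TYPE_RATES` and `nearest_scale_point` are assumptions for illustration only; the text does not describe Ayres's own grouping procedure in this detail.

```python
# Words red per minute for the type sample at each scale point (Table I).
AYRES_TYPE_RATES = {
    90: 209.2, 80: 202.7, 70: 195.1, 60: 186.2,
    50: 175.7, 40: 163.4, 30: 149.1, 20: 132.2,
}


def nearest_scale_point(words_per_minute):
    """Return the scale point whose type sample has the closest reading rate.

    Nearest-match assignment is an assumed rule, not Ayres's stated method.
    """
    return min(
        AYRES_TYPE_RATES,
        key=lambda point: abs(AYRES_TYPE_RATES[point] - words_per_minute),
    )


print(nearest_scale_point(180.0))
```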

The  scale  was  divided  into  three  longitudinal  divisions  on  the 
basis  of  slant.  The  top,  or  A  division,  contains  the  vertical 
samples. The middle, or B division, contains the samples of
medium  slant,  and  the  lower,  or  C  division,  contains  the  samples 
of  extreme  slant.  As  implied  in  the  above  table  the  scale  is  divided 
into  eight  vertical  divisions,  each  of  which  contains  a  sample  of 
each  slant.  The  three  samples  in  the  right  colum  are  markt  90%, 
those  in  the  next  colum  to  the  left  80%,  etc. 

Because  of  its  inclusion  of  samples  representing  the  three 
main  types  of  slant,  this  scale  is  easily  applied.  The  application 
of  this  scale  to  the  handwriting  of  most  school  systems  at  once 
reveals  wide  variation  in  writing  abilities,  which  implies  either 
widely  different  methods  of  teaching,  widely  different  ideals  as 
to  the  sort  of  writing  which  should  obtain,  or  widely  different 
degrees  of  zeal  towards  securing  good  writing.  The  following 
graph  (Figure  I)  of  the  writing  abilities  of  the  children  of  the 
Training  School  of  the  Illinois  State  Normal  University,  as  shown 
by  the  first  application  of  the  Ayres  scale,  reveals  the  sort  of 
variation  which  frequently  exists  when  subjectiv  standards  alone 
are  relied  upon. 



The  application  of  the  Handwriting  Scale  not  only  reveald 
wide  variation  within  each  grade,  but  it  reveald  wide  variation 
between  the  grades  as  well.  This  first  application  of  the  scale 
showd  that  there  were  two  children  in  the  sixth  grade  who  wrote 
better  than  any  of  the  children  of  the  seventh  and  eighth  grades. 
It  showd  also  that  there  were  six  children  in  the  sixth  grade  who 
made  a  grade  of  70  while  there  were  but  four  children  in  both 
the  seventh  and  eighth  grades  who  reacht  the  70  mark.  This 
test  made  it  perfectly  evident  on  the  one  hand  that  grades  five 
and  six  needed  no  extra  consideration  relativ  to  drill  in  writing, 
while  on  the  other  hand  it  showd  that  grades  seven  and  eight 
needed  a  writing  revival. 

The  graph  (Figure  II)  shows  what  was  accomplisht  by  the 
eighth-grade  teacher  after  he  became  conscious  of  the  relativ 
needs  of  his  pupils.  In  the  November  test  fifteen  pupils  made 
grades  of  40%  or  less.  In  the  May  test  none  made  a  grade  less 
than  50%.  In  the  November  test  only  two  pupils  made  grades 
of  70%  while  in  the  May  test  ten  pupils  made  grades  of  70%,  ten 
pupils  made  grades  of  80%,  and  three  pupils  made  grades  of 
90%.  A  careful  examination  of  Figure  II  will  reveal  other 
markt  changes  which  resulted  from  an  application  of  the  Hand- 
writing Scale. 

A  record  of  the  writing  ability  of  the  children  of  grades  four 
to  eight  inclusiv,  taken  in  November  and  May,  and  filed  for 
reference  becomes  a  definit  and  valuable  guide  for  any  school. 
It  makes  it  possible  to  determin  at  any  time  whether  a  sufficient 
amount  of  time  and  energy  is  being  given  to  this  subject.  Such  a 
record  protects  the  children  from  the  excessiv  zeal  or  the  indif- 
ference of  the  teacher,  and  indicates  to  the  teacher  the  relativ 
merit  of  his  endevor. 

Starch's Letter-Exposure Handwriting Test

Professor Daniel Starch of the University of Wisconsin
reported  his  handwriting  test  in  the  Journal  of  Educational  Psy- 
chology for  October,  1913.  He  pointed  out  that  the  Thorndike 
and  Ayres  scales  were  mesures  only  of  form  and  legibility 
respectivly.  He  argued  that  a  simple  analysis  of  handwriting 
shows  that  its  three  chief  elements  are  legibility,  producibility, 
and  form. 

[Figure I. Writing abilities of the children of the Training School of the Illinois State Normal University, as shown by the first application of the Ayres scale. Graph not reproduced.]

[Figure II. Writing abilities of the eighth-grade pupils in the November and May tests. Graph not reproduced.]

Starch held that legibility can best be determind by reading
exposed  areas  of  handwriting  and  thereby  determining  the 
average  rate  per  letter  of  such  reading.  In  conformity  with  this 
theory  he  prepared  a  device  for  mesuring  handwriting  as  follows : 
In  a  piece  of  cardboard  were  cut  three  circular  openings  in  a 
straight row 1.5 cm. apart. The openings were each 2.5 cm. in
diameter.  By  shifting  the  cardboard  about  over  the  writing  to 
be  mesured,  he  was  able  to  test  its  legibility  at  several  places. 
The  number  of  letters  exposed  and  the  time  required  to  read 
them  were  recorded  after  each  trial.  From  the  records  of  several 
exposed areas the average reading time per letter was computed.
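
The computation just described can be sketched in Python as follows. Pooling all trials (total seconds divided by total letters) is an assumption, since the text does not say whether trials were pooled or averaged one by one; the function name and the sample figures are hypothetical.

```python
def seconds_per_letter(trials):
    """Average reading time per letter over several exposed areas.

    trials: list of (letters_exposed, seconds_to_read) pairs, one per
    placement of the cardboard. Pooling the trials is an assumption.
    """
    total_letters = sum(letters for letters, _ in trials)
    total_seconds = sum(seconds for _, seconds in trials)
    return total_seconds / total_letters


# Hypothetical record of three placements on one specimen.
sample_trials = [(12, 3.0), (10, 2.5), (18, 4.5)]
print(seconds_per_letter(sample_trials))
```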

Starch's experiments proved that there is a remarkably close
correlation between the results obtaind by the Letter-Exposure Test
and those secured by the Thorndike and Ayres scales.

It  is  doubtful  if  the  Letter-Exposure  Test  is  as  convenient 
for  testing  the  handwriting  of  large  numbers  of  children  as  is 
either the Thorndike or the Ayres scale.

After  testing  the  efficiency  of  writing  scales  Starch  says : 
"We  may  conclude  that  after  some  practis  in  the  use  of  a  scale 
the  mesurements  with  either  scale  are  from  three  to  four  times 
as  accurate  as  the  valuations  made  by  the  usual  percentil  marking 
system." 

The  Courtis  Standard  Tests 

In December, 1910, S. A. Courtis, of Detroit, reported in
The  Elementary  School  Teacher  his  Standard  Test  (Series  A) 
in  Arithmetic.  This  test  developt  as  a  result  of  applying  the 
Stone  test  in  the  Detroit  Home  and  Day  School,  in  which  Mr. 
Courtis  was  hed  of  the  Department  of  Science  and  Mathematics. 
After  a  free  use  of  his  Series  A  Test,  which  consisted  of  testing 
the  pupils'  ability  to  use  the  four  fundamental  processes  when 
employd  in  tables  ordinarily  used  in  schoolrooms,  and  of  testing 
the  pupils'  ability  to  employ  the  reasoning  processes  involvd  in 
the solution of problems suitable to the grammar grades, Mr.
Courtis  concluded  that  "The  work  done  with  Series  A  has  proved 
that  the  basic  problem  in  education  to-day  is  that  of  ministering 
adequately  to  individual  needs.  The  first  step  towards  this  end 
is  the  formation  of  definit  objectiv  standards."  The  standards 
derived  from  the  use  of  Series  A,  however,  are  either  complex  or 
of questionable value, owing to the uncertainty of their meaning.
This is particularly true of the reasoning tests in which mere
ability  to  read  is  a  large  factor. 

Series  B  is  the  result  of  an  attempt  to  secure  definit  objectiv 
standards  for  each  of  the  four  fundamental  operations  with  whole 
numbers.  With  the  establishment  of  this  standard  it  is  possible 
to  set  for  each  grade  just  the  degree  of  skill  in  each  of  the 
fundamental  processes  that  is  within  reach  of  the  average,  or 
median,  child  of  the  grade. 

The  following  table  shows  the  median  skills  of  three  distinct 
groups  of  children  in  the  fundamentals  of  arithmetic  provided 
in  the  Courtis  test.  The  approximation  of  the  series  reveals  the 
universal  character  of  the  results. 

Table II

                          5th grade               6th grade
                       D.     B.     G.        D.     B.     G.
Addition         A    6.7    7.2    7.1       8.4    8.3    8.0
                 R    3.9    3.7    3.7       4.6    4.9    4.4
Subtraction      A    8.0    7.6    6.5       8.8    9.0    8.9
                 R    5.5    4.9    4.9       6.2    6.3    6.1
Multiplication   A    6.0    5.8    6.0       7.4    6.9    7.2
                 R    3.8    3.3    2.6       4.8    4.8    4.5
Division         A    4.9    4.5    4.5       6.4    5.5    5.8
                 R    2.7    2.0    2.3       4.4    3.3    4.3

                          7th grade               8th grade
                       D.     B.     G.        D.     B.     G.
Addition         A    9.2    9.2    8.9      10.2   11.0    9.7
                 R    3.4    5.6    4.7       6.7    7.5    5.6
Subtraction      A    9.8   10.0   10.2      12.3   11.4   11.7
                 R    7.3    6.9    7.8       9.5    8.6    8.4
Multiplication   A    9.6    8.0    8.4      10.5    9.5    9.7
                 R    6.0    5.1    5.2       7.0    6.5    6.4
Division         A    8.6    6.9    7.6      10.6    6.9    7.6
                 R    7.1    5.1    5.1       8.8    6.9    6.3

D = Detroit (1,315 children tested)
B = Boston (20,441 children tested)
G = General (3,618 children tested)
A = Number of problems attempted
R = Number of problems right



Courtis  early  discoverd  the  value  of  the  objectiv  standard  in 
determining  individual  variation.  He  says :  "The  results  of  the 
tests  disclosed  the  usual  wide  range  of  individual  variation  in 
every  grade."  After  a  use  of  the  objectiv  standard  for  some 
time  Professor  Courtis  writes:  "Not  only  did  the  variabilities 
decrease,  but  unhoped  degrees  of  accuracy  were  attaind." 

The  following  graphs  of  the  abilities  of  intermediate  pupils 
in  multiplication  and  oral  reading  as  determind  by  the  Courtis 
and  Gray  scales  show  conclusivly  how  variability  is  easily  detected 
by  the  application  of  objectiv  standards. 

The  graphs  shown  in  Figure  III  reveal  two  distinct  groups 
of  abilities  in  each  subject.  This  may  mean  that  little  care  has 
been  given  to  promotions.  It  is  more  likely  to  indicate  a  lack 
of  sufficient  drill  under  proper  conditions.  After  the  abilities  are 
once  reveald  there  is  every  reason  to  believ  that  a  conscientious 
teacher  will  raise  the  abilities  of  the  lower  group  and  thereby 
reduce  the  degree  of  variability. 

Just  as  a  proper  diagnosis  in  medicin  is  a  prerequisit  to 
effectiv  medical  treatment,  so  a  proper  diagnosis  of  the  specific 
abilities  of  pupils  is  a  prerequisit  to  the  application  of  proper 
methods. 

The  Hillegas  Scale  for  the  Mesurement  of  Quality  in  English 
Composition 

In  September,  1912,  Professor  M.  B.  Hillegas  publisht  his 
composition  scale  in  The  Teachers  College  Record.  In  the  intro- 
duction to  this  scale  Professor  Hillegas  refers  to  the  previous 
efforts  at  quantitativ  standards  by  Cornman,  Rice,  Stone,  and 
Thorndike.  He  does  not,  however,  refer  to  Rice's  pioneer  effort 
to  establish  a  standard  in  English  composition  in  1902. 

Hillegas  used  a  method  similar  to  the  one  Thorndike  used  in 
determining  quality  in  handwriting.  He,  aided  by  one  other 
person,  graded  about  7,000  compositions  into  ten  classes.  From 
these  ten  classes  seventy-five  samples  were  chosen.  Artificial 
samples  were  employd  at  the  extremes  of  his  scale,  as  they  were 
in  Thorndike's  writing  scale,  in  order  to  produce  a  scale  of  wide 
range  of  mesurement.  In  all  there  were  eighty-three  samples 
employd.  These  eighty-three  samples  were  given  to  more  than 
one  hundred  persons,  who  were  requested  to  rank  them  1,  2,  3, 
etc.,  in  the  order  of  their  merit. 


[Figure III. Abilities of intermediate pupils in multiplication and oral reading, as determind by the Courtis and Gray scales. Graphs not reproduced.]

Owing  to  misunderstandings  and  errors,  only  seventy-three 
records  were  used.  On  the  basis  of  like  characteristics  these 
records  were  reduced  to  twenty-three.  This  reduced  number  of 
samples  containd  all  the  important  steps  in  quality  from  the  poor- 
est to  the  best.  Six  other  samples,  including  two  artificial  ones, 
were  finally  added,  making  a  total  of  twenty-nine  samples. 

The  twenty-nine  samples  were  rankt  by  234  judges.  On  the 
basis  of  this  ranking  the  number  of  samples  was  reduced  to  ten. 
The  difference  between  the  merit  of  the  first  and  second  samples 
in  the  scale  is  not  identical  with  the  difference  in  merit  of  any 
other  two  successiv  samples.  These  differences,  however,  are 
sufficiently  equal  for  practical  purposes. 

The Hillegas scale is a meritorious piece of work. It is a
decided step in the right direction. The brevity of the samples
and the gradual gradation from one quality to another make its
application quite easy. The scale nevertheless has many defects.
Commenting upon the Hillegas scale, Frank W. Ballou of the
Department of Educational Investigation and Mesurement of the Boston
Schools  says :  "An  experiment  with  the  Hillegas  scale  showd 
that  the  use  of  such  an  objectiv  mesure  did  unify  the  grades 
given  to  compositions  by  teachers.  It  was  also  found,  however, 
that  the  Hillegas  scale  was  not  satisfactory  to  the  teachers  of 
Newton,  owing  to  what  seemd  to  them  to  be  inherent  faults. 
These  faults  may  be  stated  briefly  as  follows :  first,  the  scale  aims 
to  mesure  too  varied  a  product ;  second,  the  compositions  in  it  are 
not  typical  of  good  school  work — (a)  some  are  artificial,  (b) 
others  are  'bookish',  really  reproductions,  and  (c)  no  conversa- 
tion is  containd  in  any  of  them."  As  Courtis's  practical  tests  in 
arithmetic  grew  out  of  an  attempt  to  use  the  conclusions  of  Stone, 
so  an  attempt  on  the  part  of  the  teachers  of  Newton,  Mass.,  to 
use  the  Hillegas  scale  led  directly  to  the  practical  Harvard-New- 
ton Scales  for  the  Mesurement  of  English  Composition. 

Report  of  Superintendent  Bliss  on  English  Composition 

While  at  Elmira,  N.  Y.,  Superintendent  Bliss  reported  in  the 
Psychological  Clinic  for  March,  1912,  a  series  of  tests  he  had 
carried  on  in  composition.  He  had  the  children  reproduce  stories 
red  to  them.  These  reproductions  were  taken  to  the  central  offis 
and groupt, on the plan practist by Rice, into five groups. He
determind the median ability for all of the children in each of the
grades above the third. He then reported the median ability for
all of the children of that grade in the city, together with the
median for the particular grade in the school. He also publisht
sample compositions from each group in the scale.

The  results  obtaind  from  the  use  of  this  scheme  were  little 
less  than  marvelous.  He  says :  "In  a  Massachusetts  school 
system,  with  33  third-grade  teachers  the  initial  test  showd  a  city 
average  of  8.5  points,  with  twenty-three  classes  below  the  re- 
quirement and  eight  classes  above.  One  year  later  the  city  average 
was  19.2  points,  with  thirteen  classes  below  the  requirement  and 
nineteen  classes  above.  This  represented  an  increase  of  126% 
in  the  level  of  efficiency  in  the  third  grade."  Mr.  Bliss  cites  other 
cases  where  even  greater  percents  of  increase  were  made  by  the 
use  of  this  method. 
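
The 126% figure quoted above can be checked from the two city averages. The short Python sketch below does so; the helper name `percent_increase` is hypothetical.

```python
def percent_increase(before, after):
    """Relative gain of `after` over `before`, expressed in percent."""
    return (after - before) / before * 100


# City averages quoted by Bliss: 8.5 points initially, 19.2 a year later.
gain = percent_increase(8.5, 19.2)
print(round(gain))  # 126
```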

The  Harvard-Newton  Scales 

These  scales  are  the  product  of  the  work  of  the  eighth-grade 
teachers  and  the  elementary-school  principals  in  the  public  schools 
of Newton, Mass., assisted by the teachers of English in the high
schools  of  Newton,  and  by  teachers  and  principals  in  Arlington, 
Mass.,  and  Boston,  under  the  direction  of  Frank  W.  Ballou  and 
with  the  co-operation  of  the  Joseph  Lee  Fellow  for  Research  in 
Education. 

The  compositions  were  written  by  the  eighth-grade  pupils  of 
Newton.  All  of  the  compositions  of  the  eleven  grade  schools 
were  groupt  into  five  groups.  Each  group  included  specimens  of 
a  given  type  of  composition  (narration,  description,  etc.).  Each 
eighth-grade  teacher  selected  25%  of  the  compositions  of  her 
grade  on  the  basis  of  their  representativ  merit.  These  selected 
compositions  from  the  eleven  schools  were  then  arranged  into 
four  groups.  Twenty-four  readers  were  instructed  to  arrange 
the  themes  in  each  group  in  the  order  of  their  merit  and  to  arbi- 
trarily rate  the  best  theme  95%  and  each  of  the  remaining  themes 
with  reference  to  this  standard.  These  ratings  were  tabulated 
and  the  median  grade  for  each  composition  was  workt  out.  For 
example,  the  highest  grade  for  composition  number  one  was 
95%,  the  lowest  grade  was  68%,  and  the  median  grade  was  83%. 
In like manner tabulation was made of the distribution of the
ranks given each composition. They were then arranged in
serial  order  according  to  the  median  ranks,  beginning  with  the 
highest.  By  means  of  this  latter  method  it  was  discoverd  that 
25%  of  the  judges  were  radical  in  their  judgment.  Consequently 
the  25%  of  radical  readers  were  cut  off.  The  scale  was  then 
bilt  on  the  median  percentil  basis.  Out  of  the  twenty-five  composi- 
tions which  were  chosen  to  represent  each  form  of  discourse, 
six  typical  compositions  were  finally  chosen  for  the  scale.  The 
difference  in  degree  of  quality  was  carefully  workt  out  and 
the  samples  were  arbitrarily  markt  95%,  85%,  75%,  65%, 
55%,  and  45%,  respectivly. 
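
The procedure above, a median grade for each composition with the most divergent quarter of the readers set aside, can be sketched in Python. The text does not say how "radical" readers were identified, so the measure used here (mean absolute deviation from the first-pass medians) is an assumption, as are the function name and the sample ratings.

```python
from statistics import median


def trimmed_medians(ratings, drop_fraction=0.25):
    """Median grade per composition after dropping the most divergent readers.

    ratings: dict mapping reader -> list of percent grades, one per
    composition. The divergence measure is an assumed stand-in for the
    "radical" readers mentioned in the text.
    """
    readers = list(ratings)
    n_items = len(ratings[readers[0]])
    first_pass = [median(ratings[r][i] for r in readers) for i in range(n_items)]

    def deviation(reader):
        # Mean absolute distance of this reader's grades from the medians.
        return sum(abs(g - m) for g, m in zip(ratings[reader], first_pass)) / n_items

    n_drop = int(len(readers) * drop_fraction)
    kept = sorted(readers, key=deviation)[: len(readers) - n_drop]
    return [median(ratings[r][i] for r in kept) for i in range(n_items)]


# Hypothetical grades from four readers on three compositions.
sample = {
    "A": [95, 80, 60],
    "B": [93, 82, 62],
    "C": [95, 78, 58],
    "D": [70, 95, 90],
}
print(trimmed_medians(sample))
```

In this made-up case reader D departs most from the others and is dropped, after which the medians are taken over the remaining three readers.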

The Harvard-Newton Scales¹ commend themselvs to the
practical  school  man  on  the  following  points :  first,  there  is  a  scale 
for  each  form  of  discourse;  second,  the  compositions  in  the 
scale  are  the  real  productions  of  children  and  not  "bilt  up" 
compositions  for  purposes  of  securing  gradation  in  the  scale; 
third,  each  scale  consists  only  of  six  types.  This  makes  it  an 
easy  matter  for  the  person  doing  the  grading  to  familiarize  him- 
self with  the  scales.  The  greatest  weakness  in  these  scales  lies 
in  the  fact  that  they  are  best  suited  for  eighth-grade  pupils. 

An  application  of  these  scales  reveals  the  fact  that  there  is 
but  slight  variation  in  the  grades  of  two  or  more  judges.  Indeed, 
the  variation  is  so  slight  that  a  single  investigator  can  feel  rea- 
sonably certain  that  his  grades  will  not  vary  widely  from  the 
median  of  several  judges. 

In  our  opinion,  the  Harvard-Newton  Scale  ranks  for  practi- 
cability alongside  the  Thorndike  and  Ayres  handwriting  scales, 
and  the  Courtis  Tests  in  Arithmetic  (Series  B).     It  has  the  real 
ring to it and will doutless have a wide use.

The Courtis Test in English

Professor  S.  A.  Courtis  has  five  different  tests  in  English :  I, 
Handwriting  Test;  II,  English  Composition  Test;  III,  Spelling, 
Punctuation,  and  Grammar  Test;  IV,  Rates  of  Reading  and 
Writing  Test ;  and  V,  Rates  of  Reproduction  Test.  In  his  writing 
test  Mr.  Courtis  uses  four  groups  of  letters  with  five  in  a  group 
in  each  of  ten  lines.  Pupils  are  required  to  copy  these  as  rapidly 
as  they  can  and  maintain  a  good  quality.  The  speed  of  each 
child is recorded and the quality of the writing is mesured by the
Thorndike and Ayres scales. Thru the co-operation of teachers
Mr. Courtis hopes to establish a standard test in both speed and
quality for each grade.

¹The Harvard Press, 50 cents

Mr.  Courtis  bases  his  English  composition  standard  on  an 
original  story,  "Bessie's  Adventures",  parts  of  which  are  red 
while  other  parts  are  imagind.  His  method  of  determining  the 
relativ  merit  of  compositions  is  the  same  as  that  used  by  Dr.  Rice. 
Teachers  are  requested  to  group  these  original  stories  into  five 
groups,  on  the  basis  of  merit.  From  each  of  these  groups  they 
are  requested  to  select  a  sample  and  return  such  samples  to  him. 
In  this  way  he  hopes  finally  to  establish  a  standard  of  English 
abilities  in  the  several  grades,  similar  to  those  he  has  determind 
in  arithmetic.  His  other  English  investigations  follow  a  similar 
procedure.  All  of  his  tests  may  be  had  in  his  "Manual  of  In- 
structions for  Giving  and  Scoring  the  Courtis  Standard  Tests." 

Mr.  Courtis  has  not  presented  the  exercizes  in  his  English 
tests  so  clearly  and  attractivly  as  he  presented  those  of  his  arith- 
metic tests. 

The  Thorndike  Scale  for  Mesuring  Achievement  in  Drawing 

In  the  Teachers  College  Record  for  November,  1913,  Pro- 
fessor Thorndike  presented  a  scale  for  "The  Mesurement  of 
Achievement  in  Drawing".  In  reference  to  the  purpose  of  the 
scale  he  says :  "It  is  the  purpose  to  present  a  provisional  scale  by 
which  achievement  and  improvement  in  drawing  can  be  mesured 
with  somewhat  the  same  clearness,  exactness,  and  commensura- 
bility  as  achievement  and  improvement  in  lifting  weights." 

The  same  general  method  which  was  used  in  determining 
the  Thorndike  Handwriting  Scale  and  the  Hillegas  Composition 
Scale  was  employd  in  the  making  of  this  drawing  scale.  Forty- 
five  drawings  of  children  were  first  submitted  to  a  number  of 
critics  whose  ratings  reduced  the  number  to  a  series  of  fifteen 
drawings  graded  from  zero  up. 

This  series  of  fifteen  drawings  was  rated  by  376  persons,  of 
whom  sixty  were  artists  of  distinction,  eighty  were  supervizors 
of  art,  and  236  were  students  of  education  and  psychology. 

The  unit  of  the  scale  was  one  merit.  This  unit  is  "The  dif- 
ference of  merit  in  children's  drawings  which  75%  of  artists, 
teachers of art, and intelligent judges generally can distinguish,
and which 25% of them fail to distinguish." The drawing lowest
in  the  scale  was  judged  of  zero  merit.  The  difference  of  merit 
between  two  drawings  is  not  necessarily  a  unit  merit.  It  depends 
upon  the  relativ  number  of  judges  who  considerd  one  drawing 
better  than  the  other.  If  75%  of  the  judges  considerd  one  draw- 
ing superior  to  another  the  difference  in  quality  is  cald  a  unit 
merit.  If  less  than  75%  of  the  judges  distinguisht  a  difference  in 
merit  between  two  drawings  the  difference  between  the  two  is  less 
than  one  merit.  If  more  than  75%  of  the  judges  discernd  a 
difference  in  merit  the  difference  in  quality  was  markt  more 
than  one  merit.    The  following  is  the  determind  rating : 

Table III

Drawing 1 =  0   merit      Drawing  8 = 10.5 merit
Drawing 2 =  2.4 merit      Drawing  9 = 11.8 merit
Drawing 3 =  3.9 merit      Drawing 10 = 12.6 merit
Drawing 4 =  5.7 merit      Drawing 11 = 13.5 merit
Drawing 5 =  6.5 merit      Drawing 12 = 14.4 merit
Drawing 6 =  7.8 merit      Drawing 13 = 16   merit
Drawing 7 =  8.6 merit      Drawing 14 = 17   merit

The  reader  should  see  the  drawings  in  the  Teachers  College 
Record,  which  accompany  these  merit  values. 
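
The unit-merit rule described above fixes one point only: a difference noticed by 75% of the judges counts as one merit. One common way to extend such a rule to other percentages is to assume that judgments follow a normal distribution, in which case 75% corresponds to one probable error. The Python sketch below applies that assumption; it is an illustration and not necessarily Thorndike's exact computation, and the function name `merit_difference` is hypothetical.

```python
from statistics import NormalDist


def merit_difference(p_better):
    """Merit units separating two drawings, given the share of judges
    (strictly between 0 and 1) who rated the first drawing the better.

    Assumes normally distributed judgments, scaled so that 75%
    agreement equals exactly one merit.
    """
    normal = NormalDist()
    return normal.inv_cdf(p_better) / normal.inv_cdf(0.75)


print(merit_difference(0.75))  # 1.0
print(merit_difference(0.50))  # 0.0
```

Under this assumption an even split among the judges yields no difference in merit, and agreement above 75% yields more than one merit, as the text requires.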

No  one  is  more  conscious  of  the  limitations  of  this  scale  than 
is  Professor  Thorndike.  In  spite  of  its  limitations  it  is  a  valua- 
ble contribution  to  experimental  education.  The  method  of 
attack,  the  care  employd  in  determining  differences  in  merit,  and 
the  scientific  attitude  of  the  author  in  the  whole  procedure  will 
have  a  wholesome  effect  upon  investigators.  It  is  as  practical  in 
determining  the  qualities  of  children's  drawing  as  are  the  writing 
scales  in  determining  the  quality  of  handwriting.  It  would  better 
meet  the  needs  of  the  schools  if  it  attempted  to  mesure  the  various 
aspects  of  children's  art  insted  of  a  single  aspect.  It  is  to  be 
hoped  that  it  will  be  followd  by  other  "drawing  scales"  which 
are  adapted  to  mesure  the  various  aspects  of  children's  drawings. 

The  Thorndike  Reading  Scale  A:  Visual  Vocabulary 

Thorndike's  Reading  Scale  A  for  visual  vocabulary  appeard 
in  the  Teachers  College  Record  for  September,  1914.  In  present- 
ing this  scale  Professor  Thorndike  states  that  there  are  four 
fases of reading ability which should be mesured: "(1) A pupil's
ability to pronounce words and sentences seen; (2) a pupil's ability
to understand the meaning of words and sentences seen; (3) a
pupil's  ability  to  appreciate  and  enjoy  what  we  roughly  call 
'good  literature' ;  and  (4)  a  pupil's  ability  to  read  orally,  clearly, 
and  effectivly." 

The  following  scale  in  conjunction  with  the  silent  reading 
tests  perfected  by  both  Kelly  and  Gray,  given  later  in  this  report, 
is  an  adequate  mesurement  of  number  (2)  above.  Gray's  scale 
for  the  mesurement  of  oral  reading  provides  for  number  (1) 
above.  Professor  Thorndike  says  that  he  is  working  on  scales  to 
mesure  (3)  and  (4).  It  is  hoped  that  these  scales  will  soon  be 
developt. 

Thorndike Reading Scale A: Visual Vocabulary

Write your name here ____________________

Write your age here __________ years __________ months

Look at each word and write the letter F under every word that means a flower.
Then look at each word again and write the letter A under each word that means an animal.
Then look at each word again and write the letter N under each word that means a boy's name.
Then look at each word again and write the letter G under each word that means a game.
Then look at each word again and write the letter B under each word that means a book.
Then look at each word again and write the letter T under each word like now or then that means something to do with time.
Then look at each word again and write the word GOOD under every word that means something good to be or do.
Then look at each word again and write the word BAD under every word that means something bad to be or do.

4.    camel, samuel, kind, lily, cruel
5.    cowardly, dominoes, kangaroo, pansy, tennis
6.    during, generous, later, modest, rhinoceros
7.    claude, courteous, isaiah, merciful, reasonable
8.    chrysanthemum, considerate, lynx, prevaricate, reuben
9.    ezra, ichabod, ledger, parchesi, preceding
10.   crocus, dahlia, jonquil, opossom, poltroon
10.5  begonia, equitable, pretentious, renegade, reprobate
11.   armadillo, iguana, philanthropic



The  Kansas  Silent  Reading  Test 

Dean  F.  J.  Kelly  of  the  School  of  Education,  University  of 
Kansas,  while  director  of  the  Training  School  in  the  State  Normal 
at  Emporia,  developt  and  standardized  The  Kansas  Silent  Read- 
ing Test.  This  test  will  appeal  to  practical  school  men.  It  is 
definit,  simple,  and  easily  presented.  The  results  can  be  quickly 
and definitly determind. In practicability it ranks with the
Thorndike and Ayres Handwriting Scales, the Courtis Arithmetic
Tests (Series B), the Harvard-Newton Composition Scales,
Thorndike's Reading Scale, and the Ayres Spelling Scale.

The entire test consists of three carefully graded groups of exercizes:
one for the primary grades, one for the grammar grades,
and  one  for  the  high  school.  The  following  exercizes  are  chosen 
from  the  sixteen  exercizes  listed  in  the  test  for  grades  three, 
four,  and  five. 


No. 1 (Value 2.1)
Mary is older than Nellie, and Nellie is older than
Kate. Which girl is older, Mary or Kate?

No. 9 (Value 4.9)
It was a quiet, snowy day. The train was late.
The ladies' waiting room was dark, smoky and close,
and the dozen women, old and young, who sat waiting
impatiently, all lookt cross, low spirited or
stupid.

In this scene, the women probably kept their
wraps on, because they wisht to be redy to take the
train. Pretty soon the station agent came and put
more coal in the stove, which was alredy redhot in
spots. Do you think this made the women happier?


No. 10 (Value 5.6)
Below are three lines. If the first is the shortest,
place a dot above it. If the last line is shorter put
a cross above the longest. If each of the other lines
is longer than the last line, put a cross above the
shortest line.

[Three lines of unequal length are printed here in the test.]

The  Gray  Reading  Tests 

These  tests  were  developt  by  Professor  William  S.  Gray, 
now  in  the  School  of  Education,  University  of  Chicago,  while  a 
graduate student at Columbia and Chicago. Mr. Gray workt out
this scale in an endevor to determin certain facts concerning reading
achievement, rather than in an attempt to devize a test per se.
The exercizes employd consist of carefully graded
selections.  Those  for  the  oral  reading  test  increase  in  difficulty 
of  interpretation.  This  test  is  not  so  easily  operated  as  is  the 
Kansas  Silent  Reading  Test. 

The oral test is designd to mesure accuracy in pronunciation and the
frequency of omissions, insertions, substitutions, and repetitions. The silent
test  is  intended  to  mesure  the  pupil's  ability  to  determin  the 
thought  essentials  in  a  series  of  reading  exercizes. 

Alredy  a  sufficiently  large  number  of  children  have  been 
tested  to  determin  a  pretty  safe  standard  of  the  median  abilities 
of  the  children  in  grades  three  to  eight  inclusiv.  It  is  to  be  hoped 
that  this  scale  will  be  put  in  a  suitable  form  and  soon  be  made 
accessible  to  teachers. 

The Ayres Spelling Scale¹

A scale for mesuring ability in spelling prepared by Dr.
Ayres was determind from data consisting of 1,400,000 spellings
by 7,000 children in 84 cities thruout the country. The words in
the scale are 1,000 in number. These words are arranged in colums
on the basis of their difficulty. All the words in each colum
have practically the same difficulty. The scale shows the percent
that the median child of each grade should make on each colum
of words. For example, the median child in the third grade
should spell correctly 58% of the words in colum 14. The median
child in a fourth grade should spell correctly 79% of the words
in the same colum. Median abilities are indicated in like manner
for the other grades.

¹ Single copies of the Ayres Spelling Scale and of the Ayres Handwriting
Scale may be had for 5 cents each, by addressing the Russell Sage Foundation,
New York City.

(The  practicability  of  this  scale  is  characteristic  of  Dr. 
Ayres'  contributions  to  the  science  of  education.)  It  is  very 
satisfactory  for  determining  the  spelling  abilities  of  children. 
Indeed,  it  is  quite  doubtful  if  there  will  be  any  improvement  upon 
this  scale  for  the  mesurement  of  spelling  abilities  in  the  near 
future. 
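
As an illustration of how the colum standards quoted above might be applied, the following Python sketch compares the median of one class with the grade median for colum 14. Only the two figures given in the text (58% for the third grade, 79% for the fourth) come from the scale. The function name, the sample scores, and the allowance of five percentage points are hypothetical and serve only to make the example concrete.

```python
from statistics import median

# Grade medians for column 14 of the Ayres scale, as quoted in the text.
COLUMN_14_STANDARD = {3: 58.0, 4: 79.0}


def compare_with_standard(grade, pupil_percents, tolerance=5.0):
    """Return the class median and how it stands against the grade standard.

    `tolerance` (in percentage points) is a hypothetical allowance chosen
    for this sketch; it is not part of the Ayres scale.
    """
    class_median = median(pupil_percents)
    gap = class_median - COLUMN_14_STANDARD[grade]
    if gap > tolerance:
        verdict = "above standard"
    elif gap < -tolerance:
        verdict = "below standard"
    else:
        verdict = "at standard"
    return class_median, verdict


# Made-up scores: percent of column 14 words spelled correctly by each pupil.
sample_scores = [40, 50, 55, 60, 65, 70, 80]
print(compare_with_standard(3, sample_scores))
print(compare_with_standard(4, sample_scores))
```

The same scores that stand at the third grade standard fall well below the fourth grade standard, which is the kind of comparison the scale is meant to support.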

The Composition Method of Testing the Spelling Abilities of Children

It  will  be  rememberd  that  both  Rice  and  Cornman  used  the 
composition  method  of  determining  the  spelling  abilities  of  chil- 
dren. The  abilities  as  shown  by  these  investigations  were  so  high 
that  practical  school  men  considerd  them  worthless  as  standards. 

The  high  grades  reported  by  both  Rice  and  Cornman  were 
due  to  the  methods  employd.  Rice  found  the  ratio  between  all 
of  the  words  speld  correctly  (including  duplicate  words)  and  the 
misspeld  words  (duplicate  misspeld  words  not  counted).  This 
method  produced  a  low  percentage  of  error.  Cornman  attempted 
to  correct  this  error  by  counting  all  duplicate  misspeld  words  as 
well  as  duplicate  words  which  were  speld  correctly.  As  is  evident 
this  method  slightly  increast  the  percentage  of  error  in  spelling. 

The error in both methods resulted from the fact that neither
Rice nor Cornman recognized that children duplicate a
larger proportion of words which they can spell correctly than of
words which they misspell. There are at least two reasons for
this :  first,  there  is  a  nativ  tendency  to  use  freely  words  which  one 
is  confident  he  can  spell  and  to  avoid  the  use  of  words  difficult  to 
spell ;  second,  there  are  a  number  of  easily  speld  words  such  as  in, 
on,  and,  the,  so,  for,  is,  etc.,  which  make  up  the  major  portion  of 
the  duplicated  words. 

If  the  above  reasons  are  sound  it  is  evident  that  one's  spelling 
grade  is  raisd  by  increasing  the  number  of  repetitions  when 
mesured  by  the  Rice  and  Cornman  plans.  Since  children  neces- 
sarily repeat  a  large  number  of  simple  words  it  follows  that  the 
spelling  grades  of  children  will  be  too  high  when  tested  by  the 
Rice-Cornman  methods. 
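
The effect of repetition described above can be checked with a little arithmetic. The Python sketch below is one reading of the two counting plans, not a record of how Rice or Cornman actually computed their figures: it assumes that each plan yields a percent of words speld correctly, that Rice counts every correct word but each misspeld word only once, and that Cornman counts every word written. The function names and the sample composition are hypothetical.

```python
def rice_percent(tokens):
    """Percent correct when duplicates of correct words are counted
    but each misspelled word is counted only once (assumed reading)."""
    correct = sum(1 for _, ok in tokens if ok)
    distinct_misspelled = len({w.lower() for w, ok in tokens if not ok})
    return 100 * correct / (correct + distinct_misspelled)


def cornman_percent(tokens):
    """Percent correct when every written word, duplicate or not,
    is counted (assumed reading)."""
    correct = sum(1 for _, ok in tokens if ok)
    return 100 * correct / len(tokens)


# A made-up composition: (word as written, spelled correctly?).
composition = (
    [("the", True)] * 4
    + [("dog", True), ("ran", True), ("home", True)]
    + [("recieve", False), ("recieve", False), ("freind", False)]
)

# The same composition with ten more repetitions of an easy word.
padded = composition + [("the", True)] * 10

for name, sample in (("original", composition), ("padded", padded)):
    print(name, round(rice_percent(sample), 1), round(cornman_percent(sample), 1))
```

Under either plan the padded composition earns the higher grade, altho the child has shown no more spelling ability than before.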


Because  I  believd  that  a  spelling  standard  based  upon  the 
composition  method  is  the  only  standard  that  is  reliable  for  daily 
use in the school room, I began to gather data in the spring of
1915,  for  the  purpose  of  determining  a  composition  standard  of 
spelling  which  is  free  from  the  manifest  errors  in  Rice's  and  Corn- 
man's  conclusions.  Instructions  were  sent  to  a  number  of  super- 
intendents and  principals  who  had  previously  manifested  a  wil- 
lingness to  assist  in  this  investigation.  So  far  thirteen  schools 
hav  reported.  These  instructions  were  to  the  effect,  first,  that 
all duplicate words and the words I and a in the compositions
should  be  crost  out ;  second,  that  of  the  words  not  crost  out  the 
ratio  of  the  words  speld  correctly  to  those  misspeld  should  be 
exprest  in  percent. 
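
The marking instructions just given can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the result is taken to be the percent of the remaining words that are speld correctly, a word counts as a duplicate when the same written form has alredy appeard, and the function name and sample composition are hypothetical.

```python
def composition_percent(tokens, excluded=("i", "a")):
    """Percent of words spelled correctly after crossing out duplicates
    and the words "I" and "a".

    `tokens` is a list of (word as written, spelled correctly?) pairs.
    Treating identical written forms as duplicates is an assumption of
    this sketch.
    """
    seen = set()
    correct = 0
    misspelled = 0
    for written, ok in tokens:
        key = written.lower()
        if key in excluded or key in seen:
            continue  # crossed out: excluded word or duplicate
        seen.add(key)
        if ok:
            correct += 1
        else:
            misspelled += 1
    counted = correct + misspelled
    if counted == 0:
        raise ValueError("no words left to count")
    return 100 * correct / counted


# A made-up composition: (word as written, spelled correctly?).
sample = (
    [("I", True), ("a", True)]
    + [("the", True)] * 4
    + [("dog", True), ("ran", True), ("home", True)]
    + [("recieve", False), ("recieve", False), ("freind", False)]
)

print(round(composition_percent(sample), 1))
# Repeating an easy word ten more times leaves the result unchanged.
print(round(composition_percent(sample + [("the", True)] * 10), 1))
```

Because each word is counted but once, the grade cannot be raisd by repeating easy words.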

Thirteen schools returnd papers properly markt. The results
from eleven of these schools hav been tabulated, and the
median  ability  for  each  grade  determind  as  follows: 

Table IV
Median Spelling Abilities of Eleven Schools as
Determind by this Composition Method:

3rd grade  4th grade  5th grade  6th grade  7th grade  8th grade
  91%        93.6%      95.5%      96.6%      96.9%      98.2%

The Median Spelling Abilities Reported by Cornman:

3rd grade  4th grade  5th grade  6th grade  7th grade  8th grade
  94.6%      96.5%      97%        98.1%      98.9%      99.5%

A comparison of the two tables reveals a decided difference
in the two results. The real difference is greater than the tables indicate. Our
instructions  were  to  give  the  test  to  the  best  school  in  the  city. 
These  instructions  were  given  with  the  thought  that  a  standard  to 
be  of  real  value  should  represent  abilities  determind  under  most 
favorable  circumstances  rather  than  under  mediocre  circum- 
stances. It  is  quite  probable  that  the  median  abilities  shown  in 
our  report  (Table  IV)  ar  decidedly  higher  than  medians  which 
would  be  obtaind  from  testing  all  of  the  children  in  the  cities 
where  these  schools  were  located. 
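
The size of the difference may be seen by subtracting the two sets of medians grade by grade and by restating each median as a percent of error. The Python sketch below uses only the figures of Table IV and of Cornman as given above; the variable and function names are illustrativ.

```python
GRADES = [3, 4, 5, 6, 7, 8]
COMPOSITION = [91.0, 93.6, 95.5, 96.6, 96.9, 98.2]  # Table IV
CORNMAN = [94.6, 96.5, 97.0, 98.1, 98.9, 99.5]      # as reported by Cornman


def compare(grades, ours, theirs):
    """For each grade return (grade, Cornman minus Table IV,
    percent of error by Table IV, percent of error by Cornman)."""
    rows = []
    for grade, a, b in zip(grades, ours, theirs):
        rows.append((grade, round(b - a, 1), round(100 - a, 1), round(100 - b, 1)))
    return rows


rows = compare(GRADES, COMPOSITION, CORNMAN)
for grade, gap, our_error, cornman_error in rows:
    print(f"grade {grade}: gap {gap} points, "
          f"error {our_error}% against {cornman_error}%")
```

Stated as error, the third grade median of Table IV is 9.0% against 5.4% for Cornman, and the eighth grade median is 1.8% against 0.5%.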

Table  IV  is  but  a  tentativ  report  of  this  investigation.  Addi- 
tional data  and  a  more  critical  examination  of  the  various  papers 
reported  are  necessary  before  the  reliability  of  these  results  can  be 
depended  upon.  It  is  very  probable,  however,  that  additional 
data  will  show  but  slightly  changed  median  abilities  of  the  several 
grades  with  the  single  exception  of  the  third  grade.  There  is 
evidence  that  this  mark  is  too  low. 

There  is  a  prevailing  notion  abroad  in  educational  circles 
that  objectiv  standards  can  be  used  only  in  mesuring  the  skills 
of  pupils.  Persons  who  hold  this  notion  argue  that  since  these 
standards  mesure  skill  only,  the  results  of  such  mesurements  are 
of  little  value  in  determining  the  relativ  merit  of  teachers.  They 
further  argue  that  since  the  objectiv  standards  mesure  form  and 
not  content,  any  markt  attention  given  to  this  sort  of  mesurement 
will result in an over emfasis of form at the expense of content.

These arguments are based upon two fallacies: (1) It is
fallacious  to  assume  that  only  skill  can  be  mesured  by  the  ob- 
jectiv standard.  It  is  true  that  standards  for  the  mesurement  of 
skill  were  determind  first.  Standards  for  the  mesurement  of 
abilities  to  reason,  to  enjoy,  and  to  appreciate  are  following.  The 
Kansas  Silent  Reading  Test  and  the  Gray  Silent  Reading  Test 
are  both  standards  of  the  latter  type.  (2)  It  is  fallacious  to  assume 
that  attention  to  the  mesurement  of  such  abilities  as  the  funda- 
mentals in  arithmetic,  handwriting,  spelling,  form  in  reading, 
etc.,  will  result  in  an  over  emfasis  of  the  formal  subjects  to  the 
detriment  of  the  content  subjects.  This  would  not  be  fallacious 
were  it  not  true  that  grades  far  above  the  median  indicate  an  un- 
due emfasis  upon  the  subject  taught  and  consequently  are  a  mark 
of  poor  teaching. 

It  must  be  rememberd  that  an  application  of  a  standard  test 
will  detect  an  undue  emfasis  of  some  particular  subject-matter 
as  well  as  an  insufficient  emfasis  of  it. 

It  is  excedingly  important  that  the  interest  of  the  school  men 
of  the  State  of  Illinois  be  elicited  in  support  of  a  movement  to 
apply  the  objectiv  standard  more  generally.  We  should  have 
Illinois  standards  for  the  various  abilities  which  can  now  be 
definitly  mesured. 

I  would  suggest  that  a  bureau  be  establisht  by  the  State 
Teachers  Association,  or  in  connection  with  the  Department  of 
Public  Instruction,  the  State  Normal  Schools,  or  the  School  of 
Education  at  the  University  of  Illinois,  for  the  direct  purpose  of 
preparing  and  distributing  these  tests  and  for  the  purpose  of 
tabulating  and  distributing  the  results.  Any  one  of  these  branches 
of  the  public  school  system  of  the  state  should  be  and,  I  believ,  is 
willing  to  undertake  this  work  if  it  is  the  wish  of  the  school  men 
of  the  state  to  have  it  done. 




